NSF PAR Search | NSF Public Access Repository

Training Robust ML-based Raw-Binary Malware Detectors in Hours, not Months

https://doi.org/10.1145/3658644.3690208

Lucas, Keane; Lin, Weiran; Bauer, Lujo; Reiter, Michael K; Sharif, Mahmood (December 2024, ACM)

Full Text Available
Group-based Robustness: A General Framework for Customized Robustness in the Real World

https://doi.org/10.14722/ndss.2024.24084

Lin, Weiran; Lucas, Keane; Eyal, Neo; Bauer, Lujo; Reiter, Michael K.; Sharif, Mahmood (February 2024, Network and Distributed System Security Symposium)

Machine-learning models are known to be vulnerable to evasion attacks, which perturb model inputs to induce misclassifications. In this work, we identify real-world scenarios where the threat cannot be assessed accurately by existing attacks. Specifically, we find that conventional metrics measuring targeted and untargeted robustness do not appropriately reflect a model’s ability to withstand attacks from one set of source classes to another set of target classes. To address the shortcomings of existing methods, we formally define a new metric, termed group-based robustness, that complements existing metrics and is better suited for evaluating model performance in certain attack scenarios. We show empirically that group-based robustness allows us to distinguish between machine-learning models’ vulnerability against specific threat models in situations where traditional robustness metrics do not apply. Moreover, to measure group-based robustness efficiently and accurately, we 1) propose two loss functions and 2) identify three new attack strategies. We show empirically that, with comparable success rates, finding evasive samples using our new loss functions saves computation by a factor as large as the number of targeted classes, and that finding evasive samples, using our new attack strategies, saves time by up to 99% compared to brute-force search methods. Finally, we propose a defense method that increases group-based robustness by up to 3.52 times.
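As a rough illustration of the quantity this abstract is concerned with, the sketch below counts how often inputs from a set of source classes end up classified into any class of a target set after an attack. It is a minimal sketch, not the paper's formal definition of group-based robustness: the function name `group_attack_success_rate`, its parameters, and the toy labels are all assumptions made for this example.

```python
def group_attack_success_rate(true_labels, adv_predictions,
                              source_classes, target_classes):
    """Fraction of source-class inputs whose post-attack prediction
    falls in the target set (illustrative only, hypothetical helper)."""
    source = set(source_classes)
    target = set(target_classes)
    # Keep only the samples whose true class is one of the source classes.
    relevant = [(y, p) for y, p in zip(true_labels, adv_predictions)
                if y in source]
    if not relevant:
        return 0.0
    # An attack "succeeds" if the prediction lands in any target class.
    hits = sum(1 for _, p in relevant if p in target)
    return hits / len(relevant)


# Toy data (made up): true classes and predictions after perturbation.
true_labels = [0, 0, 1, 2, 3]
adv_predictions = [2, 0, 3, 2, 3]

rate = group_attack_success_rate(true_labels, adv_predictions,
                                 source_classes={0, 1},
                                 target_classes={2, 3})
robustness = 1.0 - rate
```

Here an attack counts as successful when the prediction lands in any target class, which is the difference from a targeted metric that fixes a single target class.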
Full Text Available
Adversarial Training for Raw-Binary Malware Classifiers

Lucas, Keane; Pai, Samruddhi; Lin, Weiran; Bauer, Lujo; Reiter, Michael K.; Sharif, Mahmood (August 2023, 32nd USENIX Security Symposium)

Machine learning (ML) models have shown promise in classifying raw executable files (binaries) as malicious or benign with high accuracy. This has led to the increasing influence of ML-based classification methods in academic and real-world malware detection, a critical tool in cybersecurity. However, previous work provoked caution by creating variants of malicious binaries, referred to as adversarial examples, that are transformed in a functionality-preserving way to evade detection. In this work, we investigate the effectiveness of using adversarial training methods to create malware-classification models that are more robust to some state-of-the-art attacks. To train our most robust models, we significantly increase the efficiency and scale of creating adversarial examples to make adversarial training practical, which has not been done before in raw-binary malware detectors. We then analyze the effects of varying the length of adversarial training, as well as analyze the effects of training with various types of attacks. We find that data augmentation does not deter state-of-the-art attacks, but that using a generic gradient-guided method, used in other discrete domains, does improve robustness. We also show that in most cases, models can be made more robust to malware-domain attacks by adversarially training them with lower-effort versions of the same attack. In the best case, we reduce one state-of-the-art attack’s success rate from 90% to 5%. We also find that training with some types of attacks can increase robustness to other types of attacks. Finally, we discuss insights gained from our results, and how they can be used to more effectively train robust malware detectors.
Full Text Available
Towards Usable Security Analysis Tools for Trigger-Action Programming

McCall, McKenna; Zeng, Eric; Shezan, Faysal Hossain; Yang, Mitchell; Bauer, Lujo; Bichhawat, Abhishek; Cobb, Camille; Jia, Limin; Tian, Yuan (August 2023, USENIX Association)

Research has shown that trigger-action programming (TAP) is an intuitive way to automate smart home IoT devices, but can also lead to undesirable behaviors. For instance, if two TAP rules have the same trigger condition, but one locks a door while the other unlocks it, the user may believe the door is locked when it is not. Researchers have developed tools to identify buggy or undesirable TAP programs, but little work investigates the usability of the different user-interaction approaches implemented by the various tools. This paper describes an exploratory study of the usability and utility of techniques proposed by TAP security analysis tools. We surveyed 447 Prolific users to evaluate their ability to write declarative policies, identify undesirable patterns in TAP rules (anti-patterns), and correct TAP program errors, as well as to understand whether proposed tools align with users’ needs. We find considerable variation in participants’ success rates writing policies and identifying anti-patterns. For some scenarios over 90% of participants wrote an appropriate policy, while for others nobody was successful. We also find that participants did not necessarily perceive the TAP anti-patterns flagged by tools as undesirable. Our work provides insight into real smart-home users’ goals, highlights the importance of more rigorous evaluation of users’ needs and usability issues when designing TAP security tools, and provides guidance to future tool development and TAP research.
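The door-lock example in this abstract can be sketched as a small check over a list of rules. This is a minimal sketch under stated assumptions, not any of the surveyed tools: the `Rule` type, the `OPPOSITES` table, and the rule names are hypothetical and exist only to show two rules that share a trigger but issue contradictory actions on the same device.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Rule:
    trigger: str  # condition that fires the rule
    device: str   # device the action applies to
    action: str   # action performed on the device


# Hypothetical table of mutually exclusive actions.
OPPOSITES = {frozenset({"lock", "unlock"}), frozenset({"on", "off"})}


def find_conflicts(rules):
    """Return pairs of rules with the same trigger and device
    whose actions contradict each other."""
    conflicts = []
    for a, b in combinations(rules, 2):
        same_target = a.trigger == b.trigger and a.device == b.device
        if same_target and frozenset({a.action, b.action}) in OPPOSITES:
            conflicts.append((a, b))
    return conflicts


rules = [
    Rule("user_leaves_home", "front_door", "lock"),
    Rule("user_leaves_home", "front_door", "unlock"),
    Rule("sunset", "porch_light", "on"),
]
conflicts = find_conflicts(rules)
```

With these three rules, only the two `front_door` rules are reported, since they fire on the same trigger and leave the final lock state ambiguous.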
Full Text Available
Constrained Gradient Descent: A Powerful and Principled Evasion Attack Against Neural Networks

Lin, Weiran; Lucas, Keane; Bauer, Lujo; Reiter, Michael K.; Sharif, Mahmood (July 2022, The 39th International Conference on Machine Learning)

Full Text Available
Constrained Gradient Descent: A Powerful and Principled Evasion Attack Against Neural Networks

Lin, Weiran; Lucas, Keane; Bauer, Lujo; Reiter, Michael K.; Sharif, Mahmood (July 2022, Proceedings of Machine Learning Research)

Full Text Available
Malware Makeover: Breaking ML-based Static Analysis by Modifying Executable Bytes

https://doi.org/10.1145/3433210.3453086

Lucas, Keane; Sharif, Mahmood; Bauer, Lujo; Reiter, Michael K.; Shintre, Saurabh (June 2021, Proceedings of the ACM Asia Conference on Computer and Communications Security)

Motivated by the transformative impact of deep neural networks (DNNs) in various domains, researchers and anti-virus vendors have proposed DNNs for malware detection from raw bytes that do not require manual feature engineering. In this work, we propose an attack that interweaves binary-diversification techniques and optimization frameworks to mislead such DNNs while preserving the functionality of binaries. Unlike prior attacks, ours manipulates instructions that are a functional part of the binary, which makes it particularly challenging to defend against. We evaluated our attack against three DNNs in white- and black-box settings, and found that it often achieved success rates near 100%. Moreover, we found that our attack can fool some commercial anti-viruses, in certain cases with a success rate of 85%. We explored several defenses, both new and old, and identified some that can foil over 80% of our evasion attempts. However, these defenses may still be susceptible to evasion by attacks, and so we advocate for augmenting malware-detection systems with methods that do not rely on machine learning.
Full Text Available
OmniCrawl: Comprehensive Measurement of Web Tracking With Real Desktop and Mobile Browsers

https://doi.org/10.2478/popets-2022-0012

Cassel, Darion; Lin, Su-Chin; Buraggina, Alessio; Wang, William; Zhang, Andrew; Bauer, Lujo; Hsiao, Hsu-Chun; Jia, Limin; Libert, Timothy (November 2021, Proceedings on Privacy Enhancing Technologies)

Over half of all visits to websites now take place in a mobile browser, yet the majority of web privacy studies take the vantage point of desktop browsers, use emulated mobile browsers, or focus on just a single mobile browser instead. In this paper, we present a comprehensive web-tracking measurement study on mobile browsers and privacy-focused mobile browsers. Our study leverages a new web measurement infrastructure, OmniCrawl, which we develop to drive browsers on desktop computers and smartphones located on two continents. We capture web tracking measurements using 42 different non-emulated browsers simultaneously. We find that the third-party advertising and tracking ecosystem of mobile browsers is more similar to that of desktop browsers than previous findings suggested. We study privacy-focused browsers and find their protections differ significantly and in general are less for lower-ranked sites. Our findings also show that common methodological choices made by web measurement studies, such as the use of emulated mobile browsers and Selenium, can lead to website behavior that deviates from what actual users experience.
Full Text Available
What makes people install a COVID-19 contact-tracing app? Understanding the influence of app design and individual difference on contact-tracing app adoption intention

https://doi.org/10.1016/j.pmcj.2021.101439

Li, Tianshi; Cobb, Camille; Yang, Jackie; Baviskar, Sagar; Agarwal, Yuvraj; Li, Beibei; Bauer, Lujo; Hong, Jason I. (August 2021, Pervasive and Mobile Computing)

Full Text Available
